Runbook: Slack Notification – Empty Scraping Result
| Service | Review Management x Tripla Review – Webscraper |
|---|---|
| Owner Team slack handle | @bnl-dev-bali |
| Team's Slack Channel | #bnl-teams-b |
| Alert Channel | #bnl-products-alerts |
Table of Contents
- [[#Important Links]]
- [[#1. Triage]]
- [[#2. Decision Point]]
- [[#3. False Alarm]]
- [[#4. True Incident]]
- [[#4.1. Recover the System]]
- [[#4.2. Clean up]]
Important Links
| Alert | Slack Notification – Empty Scraping Result |
|---|---|
| Webscraper Dashboard | Webscraper Dashboard URL |
| Reviewku Portal | Reviewku Portal URL |
| Reviewku Worker Logs | Worker Service Logs URL |
| Review API Endpoint | https://review-api.bookandlink.com/h/ws/notif |
1. Triage
Goal: Determine whether scraping failed due to captcha blocking, invalid start URL, or webhook/queue failure.
Step A - Validate Scraping Job
- Log in to Webscraper dashboard.
- Go to Jobs menu.
- Type Property ID in search field.
- Click Inspect.
- Open Details tab.
Check if any of these values are greater than 0:
- Failed pages
- Empty pages
- No value pages
If any value > 0 → scraping issue confirmed.
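The Step A check can be sketched as a small helper for anyone scripting the triage. This is only an illustration: `JobDetails` and `nonzero_counters` are hypothetical names, and the counters are assumed to be copied by hand from the Details tab, not fetched from any Webscraper API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobDetails:
    """Counters read manually from the Details tab (hypothetical container)."""
    failed_pages: int = 0
    empty_pages: int = 0
    no_value_pages: int = 0


def nonzero_counters(details: JobDetails) -> List[str]:
    """Return the names of the counters that are greater than 0."""
    counters = {
        "Failed pages": details.failed_pages,
        "Empty pages": details.empty_pages,
        "No value pages": details.no_value_pages,
    }
    return [name for name, value in counters.items() if value > 0]


def scraping_issue_confirmed(details: JobDetails) -> bool:
    """A scraping issue is confirmed when any counter is greater than 0."""
    return bool(nonzero_counters(details))
```

The returned names match the tabs to open in Step B.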
Step B - Identify Specific Page Issue
If Failed Pages > 0:
- Click Failed Pages tab.
- Inspect culprit page.
- Most common cause: Captcha blocking.
If Empty Pages > 0 or No Value Pages > 0:
- Click respective tab.
- Click Preview.
- Check HTTP response (example: 404).
Step C - Validate Queue Processing
Go to Reviewku Worker Service Logs.
Search by Property ID.
If you find:
Message received: map[channel:traveloka event:download-scraped-data ...]
SUCCESS: Successfully inserted X reviews with property ID Y into the main table.
Then scraping result was processed successfully.
If no log found → webhook or queue issue.
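If the worker logs are exported as text, the search for the SUCCESS line could be automated roughly as follows. This is a minimal sketch: it assumes X in the log line is an integer and Y is the property ID exactly as searched, and `inserted_review_counts` is a hypothetical helper name.

```python
import re
from typing import Iterable, List

# Pattern based on the SUCCESS line quoted in Step C.
# Assumption: X is a whole number and Y contains no whitespace.
SUCCESS_RE = re.compile(
    r"SUCCESS: Successfully inserted (?P<count>\d+) reviews "
    r"with property ID (?P<prop>\S+) into the main table"
)


def inserted_review_counts(log_lines: Iterable[str], property_id) -> List[int]:
    """Return the review counts of every SUCCESS line for the given property."""
    counts = []
    for line in log_lines:
        match = SUCCESS_RE.search(line)
        if match and match.group("prop") == str(property_id):
            counts.append(int(match.group("count")))
    return counts
```

An empty list corresponds to the "no log found" outcome, which points to a webhook or queue issue.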
2. Decision Point
- IF Failed Pages > 0 and captcha detected...
    - ➡️ **Go to: [[#4. True Incident]]**
- IF Empty Pages caused by 404 or invalid channel URL...
    - ➡️ **Go to: [[#4. True Incident]]**
- IF Webhook not delivered or no worker logs found...
    - ➡️ **Go to: [[#4. True Incident]]**
- IF scraping valid and worker inserted successfully...
    - ➡️ **Go to: [[#3. False Alarm]]**
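The routing can be written down as a function to make the branches explicit. This is a sketch under assumptions: `decide` and its parameters are hypothetical, a clean result is assumed to route to 3. False Alarm, and the fallback for non-zero counters without an identified cause is not covered by this runbook and is treated here as a True Incident.

```python
from typing import Optional, Tuple

TRUE_INCIDENT = "4. True Incident"
FALSE_ALARM = "3. False Alarm"


def decide(
    *,
    failed_pages: int,
    empty_pages: int,
    no_value_pages: int,
    captcha_detected: bool,
    invalid_channel_url: bool,
    worker_success_logged: bool,
) -> Tuple[str, Optional[str]]:
    """Map triage findings to (section, recovery case)."""
    if failed_pages > 0 and captcha_detected:
        return TRUE_INCIDENT, "Case 1 - Captcha Blocking"
    if empty_pages > 0 and invalid_channel_url:
        return TRUE_INCIDENT, "Case 2 - Empty Pages / Invalid Channel URL"
    if not worker_success_logged:
        return TRUE_INCIDENT, "Case 3 - Webhook Not Delivered"
    if failed_pages > 0 or empty_pages > 0 or no_value_pages > 0:
        # Assumption: counters are non-zero but no documented cause matched.
        return TRUE_INCIDENT, None
    return FALSE_ALARM, None
```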
3. False Alarm
If:
- Scraping values are 0
- Worker logs show successful insertion
- No new scraping errors
Then Slack alert may be outdated or previously resolved.
Actions:
- Refresh Webscraper job.
- Re-check review count.
- Monitor for 15 minutes.
Post in Slack:
Empty Scraping Result alert reviewed.
Scraping and queue processing verified.
No active issue detected.
4. True Incident
Scraping job failed or result not processed.
Primary objective: Restore successful scraping and review insertion.
4.1. Recover the System
Case 1 - Captcha Blocking
Diagnostic Steps
- Failed Pages > 0
- Page preview shows captcha challenge
Remediation Plan
- Click Inspect.
- Go to Continue tab.
- Change proxy.
- Click Continue Scraping.
Recommended proxies:
- US
- Indonesia
- Australia
- Japan
Switch between these if needed.
If none work:
- Try nearest alternative proxy once.
- If it still fails → report to Webscraper team.
Verification
- Failed Pages = 0
- Data preview contains valid reviews.
Case 2 - Empty Pages / Invalid Channel URL
Diagnostic Steps
- Empty Pages > 0
- Preview shows 404 or invalid page
Remediation Plan
- Go to official channel website (example: Expedia).
- Search property manually.
- Copy correct official property URL.
Example:
https://www.expedia.com.ph/Manila-Hotels-Red-Planet-Manila-Amorsolo
- Log in to Portal.
- Navigate: Reviewku → Properties → Edit.
- Under Channels, click settings (gear).
- Update channel URL.
- Click Save.
- Click Fetch New under respective channel tab.
Verification
- New scraping job created.
- Data preview valid.
- Failed pages = 0.
Case 3 - Webhook Not Delivered
Diagnostic Steps
- No SUCCESS log in Worker service.
- Webhook logs show non-200 status.
Remediation Plan
Re-trigger webhook manually:
curl -X POST 'https://review-api.bookandlink.com/h/ws/notif' \
-A 'webscraper.io/v1' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-H 'Signature: faebe19220e9cb22d40e9280e4083ae4f1749dc9a84267fbadd44096cf88cbbc' \
-d 'scrapingjob_id={scrapingJobID}&status=finished&sitemap_id={sitemapID}&sitemap_name={sitemapName}&custom_id={channelName}-{propID}-production'
Values found at:
Jobs → Scraping Job Details → Detail tab
- scrapingJobID
- sitemapID
- sitemap_name
Expected response:
{"action taken":"Processing Data","custom id":"expedia-64-production","job id":"39140478","site map id":"953592","status":"finished"}
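To avoid typos when filling in the placeholders, the form body and the response check could be prepared with a short script. This is a sketch only: `build_webhook_body` and `response_matches` are hypothetical helpers, the field names are taken from the curl command and expected response above, and the `Signature` header is not computed here because how it is derived is not documented in this runbook.

```python
import json
from urllib.parse import urlencode


def build_custom_id(channel_name: str, prop_id) -> str:
    """Compose custom_id in the {channelName}-{propID}-production pattern."""
    return f"{channel_name}-{prop_id}-production"


def build_webhook_body(scraping_job_id, sitemap_id, sitemap_name,
                       channel_name, prop_id) -> str:
    """Return the URL-encoded body to pass to curl's -d option."""
    return urlencode({
        "scrapingjob_id": scraping_job_id,
        "status": "finished",
        "sitemap_id": sitemap_id,
        "sitemap_name": sitemap_name,
        "custom_id": build_custom_id(channel_name, prop_id),
    })


def response_matches(response_text: str, scraping_job_id, custom_id: str) -> bool:
    """Check the JSON response against the expected job ID, custom ID and status."""
    data = json.loads(response_text)
    return (
        data.get("status") == "finished"
        and data.get("job id") == str(scraping_job_id)
        and data.get("custom id") == custom_id
    )
```

For example, `build_webhook_body("39140478", "953592", "expedia-reviews", "expedia", 64)` produces a body whose `custom_id` is `expedia-64-production`; the sitemap name `expedia-reviews` is made up for illustration.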
Verification
- Worker logs show successful insertion.
- Review count updated.
4.2. Clean up
- Confirm review count updated in Reviewku dashboard.
- Confirm no duplicate insertions.
- Monitor worker logs for 30 minutes.
- Post Slack resolution update:
Empty Scraping Result resolved.
Scraping job reprocessed successfully.
Review synchronization restored.
- If captcha blocking recurs frequently, escalate to Webscraper team for long-term mitigation.